
Trainer + Multi image v0.1.0 #41

Merged

Blaizzy merged 79 commits into main from pc/tuner on Oct 11, 2024
Conversation

Blaizzy (Owner) commented Jun 15, 2024

This PR adds:

  • LoRA and QLoRA fine-tuning
  • Multi-image support
  • Batch processing
  • Image resizing

New Models

  • Pixtral
  • Qwen2-VL
  • Llava-Interleave

Closes #73 #69
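For context on the first bullet: LoRA fine-tuning freezes the base weights and trains only a low-rank update. A minimal numpy sketch of the idea (hypothetical shapes and names, not the mlx-vlm implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen base weight of a linear layer (out_features x in_features).
W = rng.normal(size=(8, 16))

# LoRA factors: A (r x in) is small random, B (out x r) starts at zero,
# so the adapted layer is initially identical to the base layer.
r, alpha = 4, 8
A = rng.normal(size=(r, 16)) * 0.01
B = np.zeros((8, r))
scale = alpha / r

def lora_linear(x):
    # Base projection plus the low-rank update B @ A, scaled by alpha / r.
    return W @ x + scale * (B @ (A @ x))

x = rng.normal(size=16)
# With B = 0 the adapter is a no-op, matching the frozen layer exactly.
assert np.allclose(lora_linear(x), W @ x)
```

Only `A` and `B` (r * (in + out) values) receive gradients, which is what makes LoRA and its quantized variant QLoRA cheap enough to fine-tune large VLMs locally.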

lin72h commented Jun 15, 2024

It's happening!

Blaizzy (Owner, Author) commented Jun 15, 2024

Absolutely! 🚀

yukiarimo commented
Any updates? Is it usable?

Blaizzy (Owner, Author) commented Sep 20, 2024

Hey @yukiarimo

It's almost done.

I just want to run some extra tests (QLoRA, full fine-tune) and finish Qwen2-VL before releasing it.

@Blaizzy Blaizzy marked this pull request as ready for review September 28, 2024 15:20
@Blaizzy Blaizzy changed the title from Trainer to Trainer + Multi image v0.1.0 on Oct 5, 2024
@Blaizzy Blaizzy merged commit ae66c0b into main Oct 11, 2024
@Blaizzy Blaizzy deleted the pc/tuner branch October 25, 2024 19:46
Garry-TI pushed a commit to Garry-TI/mlx-vlm that referenced this pull request Sep 23, 2025
* remove torch and mlx-lm

* add peft model creation

* use tree flatten

* add dataset loader

* fix dataset

* fix masks and rename dataset

* support batch processing and train on completions

* fix trainer

* formatting

* add support for none splits and fix assistant id

* Add lora script and docs

* remove duplicates

* fix batch load

* load trained adapters and add super to all models

* fix pixtral quant

* speed up qwen batch processing

* fix qlora training

* fix dataloader

* formatting

* fix pixtral pixel loading

* fix lora and dataset

* add batch processing support for qwen2_vl

* update lora docs

* add unit tests

* set stage for phi3_v support

* update logs and readme

* add utils tests and remove unused collate fn

* refactor prompt utils and add multi-image support for pixtral

* add llava interleave support

* multi image support

* add image resizing

* refactor data loading

* update data processing and tqdm

* add llava interleave

* formatting

* add list of models with multi-image support

* remove trimmed labels

* remove warning

* pin reqs

* add config dict condition

* fix pixtral FT prompt

* formatting images

* remove unused

* update trainer init

* update lora

* update md and formatting

* bump version

* add tests for pixtral and qwen2_vl

* add tests for pixtral

* Merge branch 'pc/tuner' of https://github.com/Blaizzy/mlx-vlm into pc/tuner

* fix test

* remove rope scaling

* remove test args and update MD

* format dataset defaults

* add dataset formatting info

* Fix issues with multiple image handling (Blaizzy#78)

  1. `[IMG_BREAK]` and `[IMG_END]` tokens are lost after embedding.
  2. Image position encoding should be done on a per-image basis:
     https://github.com/mistralai/mistral-inference/blob/main/src/mistral_inference/vision_encoder.py#L85
     https://github.com/huggingface/transformers/blob/main/src/transformers/models/pixtral/modeling_pixtral.py#L492

Co-authored-by: Roger Xu <rogerxu@gmail.com>

* fix styling

* update model

* update default model

* rewrite comments

---------

Co-authored-by: hiima234 <98786318+hiima234@users.noreply.github.com>
Co-authored-by: Roger Xu <rogerxu@gmail.com>
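The per-image position-encoding fix noted in Blaizzy#78 above can be illustrated with a small sketch (pure Python, hypothetical grid sizes; not the actual Pixtral code): patch positions must restart at (0, 0) for every image, rather than continuing across the concatenated patch sequence.

```python
def patch_positions(grids):
    """grids: list of (rows, cols) patch grids, one per image.

    Returns a flat list of (image_index, row, col) triples, with the
    (row, col) coordinates restarting at (0, 0) for each image.
    """
    positions = []
    for img_idx, (rows, cols) in enumerate(grids):
        for r in range(rows):
            for c in range(cols):
                positions.append((img_idx, r, c))
    return positions

# Two images: a 2x3 patch grid followed by a 1x2 patch grid.
pos = patch_positions([(2, 3), (1, 2)])
# The second image's first patch restarts at row 0, col 0,
# instead of inheriting positions continued from the first image.
assert pos[6] == (1, 0, 0)
```

Encoding positions over the concatenated sequence instead would shift every patch of the second and later images, which is the kind of bug the linked commit addresses.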


Development

Successfully merging this pull request may close these issues.

Error running mlx-community/pixtral-12b-4bit

5 participants